B lymphocytes use somatic hypermutation (SHM) to optimize
immunoglobulins. Although SHM can rescue single point muta- 
tions deliberately introduced into nonimmunoglobulin genes, such 
experiments do not show whether SHM can efficiently evolve 
challenging novel phenotypes requiring multiple unforeseeable 
mutations in nonantibody proteins. We have now iterated SHM 
over 23 rounds of fluorescence-activated cell sorting to create 
monomeric red fluorescent proteins with increased photostability 
and far-red emissions (e.g., 649 nm), surpassing the best efforts of 
structure-based design. SHM offers a strategy to evolve nonanti- 
body proteins with desirable properties for which a high-through- 
put selection or viable single-cell screen can be devised. 

directed evolution | mPlum | Ramos | red fluorescent protein 

Directed protein evolution is one of the most powerful tools 
to engineer new protein properties not found in natural 
proteins (1, 2). To search protein sequence space within weeks 
or months rather than millennia or millions of years for natural 
selection, large protein diversities need to be iteratively gener- 
ated and screened very rapidly and efficiently. In vitro methods 
for creating genetic diversity are very powerful but laborious to 
apply iteratively when screening has to be done on transfected 
cells or organisms. Each cycle requires generation of a huge 
number of different mutant genes, transfection into cells (ideally 
so that each cell receives at most one mutant), screening for 
improved phenotype, and amplification, recovery, and sequencing
of the DNA encoding the best performers. Mutagenesis in
intact living cells would avoid repetitive transfection and reiso- 
lation of genes, but existing methods normally randomize the 
entire genome wastefully and often deleteriously rather than 
focusing on the gene of interest (3). If mammalian cells could 
autonomously diversify arbitrarily chosen target genes, one 
could evolve proteins in situ and explore much larger sequence 
spaces for protein engineering and functional studies. 

Genetic information is naturally maintained in high fidelity in 
most cell types. However, when activated by antigens, B lym- 
phocytes in the immune system can specifically mutate Igs 
through a process called somatic hypermutation (SHM) (4-8).
SHM uses activation-induced cytidine deaminase (AID) and 
error-prone DNA repair to introduce point mutations into the 
rearranged V regions of Ig at a rate of ≈4 × 10⁻³ mutations per
base pair per generation, 10⁶ times higher than that in the rest
of the genome (9). SHM can repair premature stop codons 
deliberately introduced in non-Ig genes, provided that they are 
transcribed at a high enough rate (7, 8). However, to revert a 
single fatal base pair in one step is a far more modest task than 
to find multiple subtle mutations creating a desirable phenotype 
never seen before. We demonstrate here that SHM could 
generate useful phenotypes from a foreign gene. The gene for a 
monomeric red fluorescent protein (mRFP), mRFP1.2 (10), was 
expressed in the Burkitt lymphoma Ramos, a human B cell line 
that hypermutates its Ig V genes constitutively during culture
(11). mRFP mutants with enhanced photostability and far-red 
emissions were evolved through iterative SHM and fluores- 
cence-activated cell sorting (FACS). 



Materials and Methods 

Introduction of the mRFP1.2 Gene into Ramos Cells. The mRFP1.2 
gene was amplified with primer pair TW5 (5'-CGCGGATC- 
CGCCACCATGGTGAGCAAGGGC-3') and TW3 (5'- 
CCATCGATTTAGGCGCCGGTGGAGTGGCG-3'), digested
with BamHI and ClaI, and ligated into a precut pCLNCX
(Imgenex, San Diego) derivative retroviral vector, in which the
cytomegalovirus (CMV) promoter was replaced with the inducible
Tet-on promoter. The resultant plasmid, pCTT-mRFP, was
cotransfected with pCL-Ampho (Imgenex) into HEK293 cells to
make the retrovirus, which was subsequently used to infect 
Ramos cells [CRL-1596, American Type Culture Collection 
(ATCC)] together with another retrovirus harboring the reverse 
Tet-controlled transactivator. Ramos cells were grown in mod- 
ified RPMI medium 1640 as suggested by ATCC. Doxycycline (2 
μg/ml) was added to induce the expression of mRFP 24 h before
FACS, and infected cells were sorted for six rounds to enrich red 
fluorescent cells. In the initial sorting, <5% of cells became red, 
indicating a multiplicity of infection well below 1. 
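
The step from "<5% of cells became red" to a low multiplicity of infection can be checked with a Poisson model of retroviral integration. The Poisson assumption and the function names below are illustrative and are not taken from the text; 0.05 is used as the upper bound on the infected fraction.

```python
import math

def moi_from_infected_fraction(infected_fraction):
    """Mean integrations per cell m, assuming Poisson: P(>=1) = 1 - exp(-m)."""
    return -math.log(1.0 - infected_fraction)

def multi_copy_share(moi):
    """Fraction of infected cells carrying two or more integrations (Poisson)."""
    p0 = math.exp(-moi)          # no integration
    p1 = moi * math.exp(-moi)    # exactly one integration
    return (1.0 - p0 - p1) / (1.0 - p0)

# Upper bound from the text: fewer than 5% of cells were red after infection.
moi = moi_from_infected_fraction(0.05)
share = multi_copy_share(moi)
print(f"MOI ~ {moi:.3f}; multi-copy share among infected cells ~ {share:.3f}")
```

Under these assumptions only a few percent of the red cells would carry more than one provirus, which is consistent with treating the target gene as a single-copy integrant.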

Protein Evolution by FACS. Ratio sorting was applied to evolve 
mRFP mutants with red-shifted emissions. Ramos cells were 
excited at 568 nm, and two emission filters (660/40 and 615/40) 
were used. The ratio of intensity at 660 nm to that at 615 nm was 
plotted against the intensity at 660 nm. Cells with the highest 
ratio and sufficient intensity at 660 nm were collected (Fig. 1B).
Usually one million cells were collected each time, and they were 
grown in the absence of doxycycline until 24 h before the next 
round of sorting. 
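
The ratio-sorting criterion described above can be sketched as a two-part gate: a minimum brightness at 660 nm, then the highest 660/615 ratios among the cells that pass. The threshold value, the default collected fraction, and the function name are assumptions made for illustration; the actual gates were drawn on the sorter (Fig. 1B).

```python
import random

def ratio_sort_gate(i660, i615, top_fraction=0.05, min_i660=100.0):
    """Return indices of cells to collect.

    A cell is eligible if it is bright enough at 660 nm; among eligible
    cells, those with the highest 660/615 intensity ratio are kept.
    """
    eligible = [(a / b, k) for k, (a, b) in enumerate(zip(i660, i615))
                if a >= min_i660 and b > 0]
    if not eligible:
        return []
    eligible.sort(reverse=True)  # highest ratio first
    n_keep = max(1, round(top_fraction * len(eligible)))
    return sorted(k for _, k in eligible[:n_keep])

# Synthetic intensities (hypothetical, arbitrary units) to exercise the gate.
random.seed(0)
i615 = [random.lognormvariate(6.0, 0.5) for _ in range(10_000)]
i660 = [x * random.lognormvariate(-1.0, 0.3) for x in i615]
picked = ratio_sort_gate(i660, i615)
print(len(picked), "of", len(i615), "cells collected")
```

Applying the brightness cut before ranking by ratio keeps dim cells with noisy ratios out of the collected population.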

Mutant Characterization. Sorted cells were amplified in the absence
of doxycycline, and 0.1 μg/ml doxycycline was then added
for 10 h. Total mRNA was extracted from these cells and used 
as template for RT-PCR to clone mRFP mutant DNA with 
primer pair pCL5 (5'-AGCTCGTTTAGTGAACCGTCA- 
GATC-3') and pCL3 (5'-GGTCTTTCATTCCCCCCTTTT- 
TCTGGAG-3'). These mutant mRFP genes were subcloned into 
a pBAD vector (Invitrogen) and expressed in Escherichia coli. A 
His-6 tag was added to the C terminus to facilitate protein 
purification using Ni-NTA chromatography (Qiagen, Valencia,
CA). Spectroscopic measurements were as described previously 
(12), except that concentrations of mRFPs were determined by 
assuming an extinction coefficient after denaturation in 0.1 M 
NaOH of 44,000 M⁻¹·cm⁻¹ at 452 nm, the same value as that of
similarly denatured Renilla GFP (13, 14). 
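
The concentration determination rests on the Beer-Lambert law, A = εcl, applied to the denatured protein. A minimal sketch follows; the absorbance readings, the 1-cm path length, and the function names are hypothetical, and only the 44,000 M⁻¹·cm⁻¹ coefficient comes from the text.

```python
EPSILON_DENATURED = 44_000.0  # M^-1 cm^-1 at 452 nm in 0.1 M NaOH (value from the text)

def protein_concentration_molar(a452_denatured, path_cm=1.0):
    """Chromophore concentration from absorbance of the denatured protein."""
    return a452_denatured / (EPSILON_DENATURED * path_cm)

def native_extinction_coefficient(a_native_peak, concentration_molar, path_cm=1.0):
    """Extinction coefficient of the native protein at its absorbance peak."""
    return a_native_peak / (concentration_molar * path_cm)

# Hypothetical readings taken at the same dilution.
conc = protein_concentration_molar(0.22)
ec_native = native_extinction_coefficient(0.30, conc)
print(f"concentration = {conc * 1e6:.2f} uM, native EC = {ec_native:.0f} M^-1 cm^-1")
```

Measuring the native and denatured absorbances on the same sample makes the native extinction coefficient independent of any separate protein assay.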

Photobleaching Measurements. Microdroplets of aqueous protein, 
pH 7.4, typically 5-10 μm in diameter, were created on a
Fig. 1. Directed evolution of mRFP with red-shifted emission by SHM in
Ramos cells. (A) Schematic illustration of the construct and evolutionary
process. TetO/PminCMV, Tet operator/minimal CMV promoter. (B) Typical FACS
criteria for ratio sorting. Ramos cells were excited at 568 nm, and two emission
filters (660/60 and 615/40) were used. The ratio of intensity at 660 nm to that
at 615 nm was plotted against the intensity at 660 nm. Cells with the highest
ratio and sufficient intensity at 660 nm were collected. Usually 1-2 million cells
were collected each time, and they were grown in the absence of doxycycline
until 24 h before the next round of FACS. Cell populations from rounds 1, 10,
and 20 are shown in blue, green, and yellow, respectively. Collected cells are
highlighted in red. a.u., arbitrary unit. (C) Fluorescence emission maxima of the
Ramos cell population in each round, measured with a spectrofluorometric
plate reader (Safire, Tecan, Maennedorf, Switzerland).



microscope coverslip under mineral oil and bleached by using a 
Zeiss Axiovert 200 microscope at 14.3 W/cm² with a 75-W xenon
lamp and a 540- to 595-nm excitation filter. Reproducible results 
required preextraction of the mineral oil with aqueous buffer 
shortly before microdroplet formation. 



Fig. 2. Evolution pathway of the mutants. (A) Nucleotides mutated by SHM
in different rounds. Twenty random samples were sequenced in round 0, 8 in 
round 10, 8 in round 14, and 12 in round 23. (B) Amino acid mutations, 
quantum yields (QY), and extinction coefficients (EC) of different mutants. 
R10F5 represents mutant F5 from round 10. We named mutants R10D6 and 
R23H6 mRaspberry and mPlum, respectively. (C) Stereoview of mutation loci 
in mPlum based on the crystal structure of DsRed. The chromophore of RFP is 
shown in red. Residues are highlighted in yellow for emission-shift mutations 
and in gray for neutral mutations. 



Identification of Integration Loci. The integration loci of the provirus
in the Ramos genome were determined by using reverse PCR as
described (15), except that the secondary PCR products were
directly sequenced without further cloning after agarose gel
electrophoresis and purification. BamHI and Sau3AI were used
to digest genomic DNA separately. The primary PCR primer
pairs were as follows: LW131 (5'-GACAGCTTCAAGTAGTCGGGGATG-3')
and LW132 (5'-CTTCCCCGAGGGCTTCAAGTGGG-3') for BamHI cloning;
and LW128 (5'-CGAACAGAAGCGAGAAGCGAAC-3') and LW129
(5'-CGCGCTTCTGCTCCCCGAGCTC-3') for Sau3AI cloning.
The secondary PCR primer pairs were as follows: LW130
(5'-TCGCCCTTGCTCACCATGGTGGC-3') and LW127
(5'-GCCAGTCCTCCGATTGACTGAGTC-3') for BamHI cloning;
and LW126 (5'-CACCCTGGAAACATCTGATGGTTC-3')
and LW127 for Sau3AI cloning. Sequences were compared
with EST databases by using blastn (www.ncbi.nlm.nih.gov), 
Fig. 3. Characterization of evolved mutant proteins. (A) Fluorescence spectra of purified parental mRFP1.2 protein and representative mutant proteins from
different rounds. Black dot, mRFP1.2; blue dash, mRaspberry; green dash dot, R14H4; red solid line, mPlum. In rounds 22 and 23, brighter cells were sorted while
maintaining the ratio. Thus, mutants from rounds 21 and 23 have similar fluorescence spectra, except that round 23 mutants have larger extinction coefficients.
All emission spectra were taken at the excitation wavelength 564 nm, and emission was monitored at 640 nm for excitation spectra. (B) Fluorescence intensity
decay during photobleaching was at 14.3 W/cm² at 568 nm. Color code is as in A.



and all matching sequences had blastn probability values
of <10⁻⁶¹.

Results and Discussion 

SHM Introduces Mutations into an Exogenous, Non-Ig Gene. The gene
for mRFP1.2 (10) was expressed as a single copy in Ramos under 
the control of a doxycycline-inducible promoter, Tet-on (Fig. 
1A), so that SHM could be controlled by varying the transcrip- 
tion level. First, fluorescent cells were enriched by using six 
rounds of FACS of cells to which 2 μg/ml doxycycline was added
24 h before each round to induce mRFP expression. A fluores- 
cent cell population was established with >96% cells fluores- 
cent. Sequencing of different clones revealed many mutations 
with features of SHM scattered throughout the target gene (Fig. 
2A, Round 0). Of 20 samples sequenced, 12 had 1-3 mutations. 
Starting from this fluorescent population, >15% of cells lost 
fluorescence when doxycycline was added for 120 h, whereas 
<5% lost fluorescence when doxycycline was present for only 
24 h, suggesting that more transcription generated more muta- 
tions. In control HEK293 cells lacking SHM, a similarly estab- 
lished fluorescent population did not change its fluorescence 
significantly upon such treatment. 

Evolution of mRFP Mutants with Far-Red-Shifted Emissions. We next 
tested whether an mRFP with red-shifted emission could be 
evolved directly in Ramos. The parental mRFP1.2 fluoresces 
with a peak at 612 nm. A longer wavelength emission would 
confer greater tissue penetration and spectral separation from 
autofluorescence and other fluorescent proteins. In each round
of sorting, we collected ≈5% of the population having the
highest ratio of 660-nm to 615-nm emissions yet maintaining at
least a minimum brightness at the former (Fig. 1B). Over 23
rounds of sorting and regrowth, the emission maxima shifted to 
longer wavelengths in several steps (Fig. 1C). After each major 
step, mutant mRFP genes were isolated, sequenced (Fig. 2), and 
transferred to a standard bacterial expression system so that 
mutant proteins could be purified in larger quantities and 
characterized (Fig. 3A).

The mutant with the longest emission wavelength (dubbed 
"mPlum" in view of its monomeric nature, purplish appearance 
by reflected light, and deep red glow) peaked at 649-nm emis- 
sion, 37 nm longer than the peak of the original mRFP1.2 and 
12 nm beyond the previous furthest-red emitter, the tandem 
dimer t-HcRed1 (16). The absorbance and excitation maxima of
mPlum remained at 590 nm, surprisingly unchanged from those 



of mRFP1.2 and identical to those of t-HcRed1. The 59-nm
Stokes' shift is unusually large. The fluorescence quantum yield 
of mPlum (0.10) is somewhat lower than that of mRFP1.2 (0.25) 
but still well above that of t-HcRed (0.04). The far-red emission 
of mPlum will be useful for improving optical imaging in intact 
mammals (17), especially of chimeric proteins and genetically 
encoded indicators (18) where it is essential that the fluorescent 
protein be monomeric. The largest wavelength of excitation (598 
nm), extinction coefficient (86,000 M⁻¹·cm⁻¹), and quantum
yield (0.15) were found in a round 10 mutant, "mRaspberry," 
whose emission maximum was 625 nm. The times for 50% 
maturation of mPlum and mRaspberry are ≈100 and ≈55 min,
respectively. 

Furthermore, all evolved mutants were considerably more 
resistant to photobleaching than the parental mRFP1.2. When
exposed to a 14.3-W/cm² beam of ≈568-nm light on a microscope
stage, microdroplets of mPlum and mRaspberry under oil
took 80 and 14 s, respectively, to bleach to 50% of initial
intensity, 30- and 5.2-fold longer than mRFP1.2 (Fig. 3B). The
repeated FACS selection for cells exceeding a minimum brightness
might have promoted photostability by discriminating
against mutants that bleached significantly during the passage
through the intense laser excitation spot.

Analysis of SHM Evolution Pathway. DNA sequences of these 
mutants revealed the evolution pathway. New mutations, includ- 
ing silent ones, were generated in each round (Fig. 2A), indi- 
cating that SHM does not stall but keeps exploring the sequence 
space. Within a round, different clones shared common mutations,
such as F65C and I161M in round 10 (Fig. 2B). Beneficial
mutations were preserved from round to round, such as I161M 
and V16E. Although thymine is not favored for SHM (5), it was 
mutated to guanine or adenine in the F65 codon to generate 
cysteine and isoleucine, respectively, and to adenine in the V16
codon to generate glutamic acid. These results indicate that 
beneficial mutations can be found although they may be rela- 
tively disfavored by SHM's known biases for particular base 
changes. 

Comparison of mutations with phenotypes indicates that 
alterations at positions 16 and 65 gave rise to the dramatic red 
shift of the emission peak, whereas mutations at positions 124 
and 161 mainly narrowed the emission width by shrinking the 
short-wavelength side of the peak. The latter is a subtle beneficial
effect that is usually difficult to achieve. When mapped on the 
crystal structure of DsRed (19), from which mRFP was derived 
(Fig. 2C), residue 65 just precedes the chromophore. Residues 16
and 161 are located at opposite ends of the chromophore with 
side chains facing it. Residue 124 also faces inward, toward the 
helix bearing the chromophore. Mutation of these residues could 
directly perturb the chromophore's microenvironment, resulting 
in emission shift. In contrast, residues 17, 45, and 166 face away 
from the chromophore, and thus their major contribution is to 
improve protein folding and brightness. 

Parallel experiments using random mutagenesis or rational 
design based on crystal structure have not yet generated mRFP 
mutants with emission maxima of >632 nm, suggesting that 
SHM can solve challenging problems in global searching. So far, 
all wild-type members of this chromoprotein superfamily with 
absorbance maxima of >570 nm have been nonfluorescent 
tetramers rather than fluorescent monomers (20, 21), so a few 
months of SHM and FACS accomplished what billions of years 
of evolution in coral reefs apparently has not. 

More detailed evidence for the power of SHM is that tradi- 
tional in vitro saturation mutagenesis at each locus identified by 
SHM produced no further increase in emission wavelengths. 
Instead, most mutations resulted in either fluorescence loss or 
blue shift. For example, the emission spectra in Fig. 4 A and B
show that SHM found the optimum substitutions at positions 65 
and 124, respectively. Saturation mutation results for positions 
16 and 17 (16/17) and positions 161 and 166 (161/166) are 
tabulated in Fig. 4C. Furthermore, several residues such as T127
were mutated in some but not all SHM clones. Saturation 
mutagenesis at these loci indicated that they are neutral; i.e., they 
do not affect emission wavelength (Fig. 4B). In addition, saturation
mutagenesis was performed on residues 16 and 65 simultaneously
to test whether a better combination exists to afford
longer emission. Again, all mutational combinations different 
from that found by SHM led to fluorescence loss or blue shift.
These results suggest that our method can identify and locally 
optimize critical residues to cope with selection pressure. 

Integration Loci of Target Gene in Ramos Genome. Whether SHM is 
locus-specific for the IgV gene is under debate. We determined 
the integration loci of the mRFP mutant gene in several repre- 
sentative rounds of FACS-selected cells. In round 2, the target 
gene was mainly found in chromosomes 5, 16, 18, and 20, which 
do not contain Ig genes. The multiple-site distribution is ex- 
pected because retrovirus integrates into host genome rather 
randomly. Mutations with the characteristics of SHM were found 
in the target gene as early as round 0 with high frequency (12 of 
20 sequenced clones had mutations). However, by round 23, 
when mPlum was isolated, only a single integration site was 
found, at the Ig heavy-chain locus in chromosome 14 between 
gene IgHV7-34-1 and IgHV4-34. These results show that SHM
can mutate exogenous genes integrated at many loci in the 
genome (8), but the mutation rate may be higher at the Ig loci. 
Therefore, desired properties requiring multiple mutations are 
more likely generated from a clone with the target gene inte- 
grated at Ig loci. In future applications, it would be more efficient 
to direct the target gene to such loci to jump-start the evolution. 
Such targeting should not be difficult, because universal gene 
replacement vectors working at Ig loci have been generated 
previously (22). 

Sequence Space Sampled by SHM. Although only seven mutations 
are needed to convert mRFP1.2 to mPlum, the sequence space 
that SHM has to sample to reach this final status presumably 
should be much larger, because many mutations, alone or 
combined, are deleterious to fluorescence intensity, emission, 
and protein folding. The 3' LTR of the retrovirus vector used to 
introduce the mRFP gene into the Ramos genome can serve as 
an internal reference for the cumulative effect of SHM unbiased 
by phenotypic selection. We sequenced the very end of the 3' 
Fig. 4. Saturation mutagenesis analysis of positions identified by SHM in
mPlum. (A) Fluorescence emission spectra of mutants with different mutations
at position 65. All mutations other than the SHM-identified isoleucine dra- 
matically blue-shift the emission. (B) Fluorescence emission spectra of mutants 
with different mutations at positions 124 and 127. Mutations at position 124 
other than the SHM-identified valine broaden the emission peak to the 
short-wavelength side. Regardless of the mutations at position 127, mutants 
with leucine or cysteine at position 124 overlap, and mutants with valine at 
position 124 also overlap. (C) Fluorescence emission peaks of mPlum mutants 
with different mutations at positions 16 and 17 and positions 161 and 166. For 
saturation mutagenesis at positions 16 and 17, representative mutants with 
emission spanning from 614 to 649 nm were sequenced. For saturation 
mutagenesis at positions 161 and 166, only mutants with emissions of >640 
nm were sequenced. Of 15 sequenced samples, methionine/lysine was found
in 11 clones and serine/arginine in 2 clones. 



LTR in mPlum and two randomly picked clones in round 2 and 
compared them with the original sequence in the parental vector 
(Fig. 5). Among 117 sequenced nucleotides, the 3' LTR in 
mPlum has 15 mutations including an insertion, whereas there is 
only one mutation in two clones from round 2. The high 
mutation frequency indicates that SHM indeed samples a large 
sequence space. 
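
The comparison above can be made explicit as per-nucleotide mutation frequencies. This is a rough sketch that assumes both round 2 clones were sequenced over the same 117 nucleotides and that the insertion counts as one mutation; the toy aligned sequences and function names are hypothetical.

```python
def count_differences(reference, clone):
    """Count mismatched columns between two pre-aligned sequences ('-' marks a gap)."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for r, c in zip(reference, clone) if r != c)

def mutation_frequency(n_mutations, n_nucleotides):
    """Mutations per sequenced nucleotide."""
    return n_mutations / n_nucleotides

LTR_LENGTH = 117  # sequenced nucleotides at the end of the 3' LTR (from the text)

mplum_freq = mutation_frequency(15, LTR_LENGTH)       # mPlum clone
round2_freq = mutation_frequency(1, 2 * LTR_LENGTH)   # one mutation across two clones
fold = mplum_freq / round2_freq
print(f"mPlum: {mplum_freq:.3f}/nt, round 2: {round2_freq:.4f}/nt, ratio ~ {fold:.0f}x")

# Toy alignment (hypothetical sequences) showing how mutations would be counted.
toy_diffs = count_differences("ACGT-ACGT", "ACGTTACGA")
```

Because the 3' LTR is not under phenotypic selection, this ratio reflects accumulated SHM activity rather than enrichment by sorting.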

Conclusion 

SHM of non-Ig genes is no longer limited to repairing artificial 
defects but can accumulate multiple reinforcing mutations 
Fig. 5. Alignments of the end of the 3' LTR from the retroviral vector, two clones of round 2, and mPlum. Mutations are shaded in black.



throughout the gene to produce new and desirable phenotypes 
difficult or impossible to find by conventional mutagenesis. 
SHM-mediated protein evolution in live cells obviates labor- 
intensive in vitro mutagenesis and screening, samples a large 
protein space, and directly links genotypes to cell phenotypes. 
An engineered error-prone DNA polymerase I can perform 
somewhat analogous targeted mutagenesis on multicopy ColE1
plasmids in bacteria (23), but SHM works on single-copy inte- 
grants in well established mammalian cell lines, which are 
indispensable for the study of many eukaryotic proteins such as 
therapeutic targets. SHM should provide a general strategy to 
iteratively accumulate multiple desirable mutations in many 
other proteins whose function can be robustly assessed by 



high-throughput selections and screens that leave the desired 
cells alive. Catalytic antibodies (24) have been the showcase for 
using the immune system to evolve functions remote from 
immunology, but the repertoire of useful B cell creativity has 
now further expanded to proteins unrelated to antibodies. 